Journal: NeuroImage
Article Title: Evaluating the reliability of neurocognitive biomarkers of neurodegenerative diseases across countries: A machine learning approach
doi: 10.1016/j.neuroimage.2019.116456
Figure Legend Snippet: Standardize (grey box): data were standardized by converting them to z-scores, so that each feature in the control group had a mean of zero and a standard deviation of one. Data exploration (light orange boxes): these procedures were used only to explore the data and learn how they behave. Clustering: we used a k-means algorithm with k = 2 to separate participants into two clusters, to explore the data distribution and evaluate the presence of potential sub-groups of participants (details in ). Visualization and inspection: the hypothetically most informative features (cognitive screenings and atrophy) were inspected by plotting pairs of dimensions for the reference dataset (Country-1) (details in ). Principal component analysis (PCA): we used MATLAB’s default implementation of PCA to explore the most informative combination of features, as measured by the explained variance of the data. Classification (light green boxes): Within-country classification: we implemented a logistic regression classifier with cognitive and brain atrophy features within the Country-1 dataset, given that it was the largest one with full completion of cognitive screenings. To evaluate the performance of this model, we used a leave-two-out cross-validation scheme. Cross-country classification: this was performed to further validate the generalization and predictive power of our findings. The logistic regression classifier was trained on Country-1 subjects and tested on participants from Country-2 or Country-3. Finally, to evaluate the relevance of each feature, after performing the classification with the whole feature set, the procedure was repeated omitting one feature at a time from the classification (details in and ).
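The standardization, leave-two-out, and feature-omission steps described in the legend can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MATLAB code: the function names, the toy values, and the feature labels are hypothetical, and the sketch assumes that leave-two-out enumerates every pair of participants (the legend does not say whether held-out pairs were class-balanced). The classifier itself is left out; any logistic regression routine could be fitted on each training split.

```python
from itertools import combinations
from statistics import mean, stdev


def zscore_against_controls(rows, is_control):
    """Z-score every row using the control group's mean and sample SD only,
    so that controls end up with mean 0 and SD 1 on each feature."""
    controls = [r for r, c in zip(rows, is_control) if c]
    n_features = len(rows[0])
    mus = [mean([r[j] for r in controls]) for j in range(n_features)]
    sds = [stdev([r[j] for r in controls]) for j in range(n_features)]
    return [[(r[j] - mus[j]) / sds[j] for j in range(n_features)] for r in rows]


def leave_two_out_splits(n_samples):
    """Yield (train_indices, test_indices) for every pair of held-out samples.
    Assumption: all pairs are used, regardless of class membership."""
    for pair in combinations(range(n_samples), 2):
        held_out = set(pair)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, list(pair)


def leave_one_feature_out(feature_names):
    """Yield (omitted_name, remaining_names), dropping one feature at a time."""
    for name in feature_names:
        yield name, [f for f in feature_names if f != name]


# Toy data (hypothetical values): two features, three controls, two patients.
rows = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [5.0, 6.0], [6.0, 4.0]]
is_control = [True, True, True, False, False]

z = zscore_against_controls(rows, is_control)
splits = list(leave_two_out_splits(len(rows)))
ablations = list(leave_one_feature_out(["cognitive_screening", "atrophy"]))
```

Because the mean and standard deviation come from controls only, patient scores read directly as deviations from the control distribution. For cross-country testing, the same sketch would apply the Country-1 control statistics to the other datasets, although the legend does not specify which reference group was used there.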
Article Snippet: We used MATLAB’s default implementation of PCA with a singular value decomposition of the feature correlation matrix.
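For reference, the link between a decomposition of the correlation matrix and "explained variance" can be written out as below. This is a generic textbook sketch, not taken from the article: the symbols Z, R, V, Λ, n, and p are notation introduced here, and it assumes the features are z-scored over the sample being analysed (the legend standardizes against controls, so this is a simplification).

```latex
% Z: n x p matrix of standardized features (zero mean, unit variance per column)
\begin{align}
  R &= \frac{1}{n-1}\, Z^{\top} Z
      && \text{(feature correlation matrix, } p \times p\text{)} \\
  R &= V \Lambda V^{\top}, \quad
      \Lambda = \operatorname{diag}(\lambda_1 \ge \dots \ge \lambda_p \ge 0)
      && \text{(SVD of a symmetric PSD matrix)} \\
  T &= Z V
      && \text{(principal component scores)} \\
  \text{explained variance of component } i
    &= \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j} = \frac{\lambda_i}{p}
      && \text{(since } \operatorname{tr}(R) = p\text{)}
\end{align}
```

Because R is symmetric and positive semidefinite, its singular value decomposition coincides with its eigendecomposition, so the singular values are the component variances.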
Techniques: Control, Standard Deviation, Biomarker Discovery